Attitude Control


Deep reinforcement learning-based spacecraft attitude control with pointing keep-out constraint

Yang, Juntang, Ben-Larbi, Mohamed Khalil

arXiv.org Artificial Intelligence

This paper implements deep reinforcement learning (DRL) for spacecraft reorientation control with a single pointing keep-out zone. The Soft Actor-Critic (SAC) algorithm is adopted to handle continuous state and action spaces. A new state representation is designed to explicitly include a compact description of the attitude constraint zone. The reward function is formulated to achieve the control objective while enforcing the attitude constraint, and a curriculum learning approach is used for agent training. Simulation results demonstrate the effectiveness of the proposed DRL-based method for pointing-constrained spacecraft attitude control.
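The keep-out constraint described above can be sketched as a simple geometric test: the attitude violates the constraint when the instrument boresight falls inside a cone around the forbidden direction. The following is a minimal illustration of such a reward penalty term; the function name, penalty value, and cone parameters are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def keepout_penalty(boresight, zone_axis, half_angle_rad, penalty=-10.0):
    """Hypothetical reward term: penalize attitudes whose boresight lies
    inside the keep-out cone around zone_axis (values illustrative)."""
    cos_sep = np.dot(boresight, zone_axis) / (
        np.linalg.norm(boresight) * np.linalg.norm(zone_axis))
    inside = cos_sep > np.cos(half_angle_rad)
    return penalty if inside else 0.0
```

A term like this, added to a tracking reward, lets the agent learn to reorient around the forbidden cone rather than through it.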


Dimension-Decomposed Learning for Quadrotor Geometric Attitude Control with Almost Global Exponential Convergence on SO(3)

Gao, Tianhua, Izumita, Masashi, Tomita, Kohji, Kamimura, Akiya

arXiv.org Artificial Intelligence

This paper introduces a lightweight and interpretable online learning approach called Dimension-Decomposed Learning (DiD-L) for disturbance identification in quadrotor geometric attitude control. As a module instance of DiD-L, we propose the Sliced Adaptive-Neuro Mapping (SANM). Specifically, to address underlying underfitting problems, the high-dimensional mapping for online identification is axially "sliced" into multiple low-dimensional submappings (slices). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional subtasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without the persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the state solution of the rotational error dynamics exponentially converges into an arbitrarily small ball within an almost-global attraction domain, despite time-varying disturbances and inertia uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unseen disturbances or specific knowledge of the model. To our knowledge, DiD-L is the first online learning approach in the quadrotor control field that is lightweight enough to run in real time at 400 Hz on microcontroller units (MCUs) such as the STM32, and it has been validated through real-world experiments.
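The "slicing" idea above can be illustrated with a toy per-axis adaptive law: instead of identifying a 3-D disturbance with one high-dimensional map, each axis runs its own 1-D update driven only by its own error component. The gain, time step, and update form below are illustrative stand-ins, not the paper's SANM equations.

```python
import numpy as np

def slice_update(d_hat, err, gamma=2.0, dt=0.005):
    """One Lyapunov-style adaptation step per axis: each slice's
    disturbance estimate is driven only by its own error component,
    so three 1-D updates replace one 3-D identification problem.
    (gamma and dt are illustrative, not the paper's values.)"""
    return d_hat + gamma * err * dt

# Decomposed update: the three axes are handled independently.
d_hat = slice_update(np.zeros(3), np.array([0.4, -0.1, 0.0]))
```

Because each slice is low-dimensional, the per-step cost stays small enough for MCU-rate execution, which is the point of the decomposition.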


Robust and Agile Quadrotor Flight via Adaptive Unwinding-Free Quaternion Sliding Mode Control

Yazdanshenas, Amin, Faieghi, Reza

arXiv.org Artificial Intelligence

This paper presents a new adaptive sliding mode control (SMC) framework for quadrotors that achieves robust and agile flight under tight computational constraints. The proposed controller addresses key limitations of prior SMC formulations, including (i) the slow convergence and almost-global stability of SO(3)-based methods, (ii) the oversimplification of rotational dynamics in Euler-based controllers, (iii) the unwinding phenomenon in quaternion-based formulations, and (iv) the gain-overgrowth problem in adaptive SMC schemes. Our controller is computationally efficient and runs reliably on a resource-constrained nano quadrotor, achieving 250 Hz and 500 Hz refresh rates for position and attitude control, respectively. In an extensive set of hardware experiments with over 130 flight trials, the proposed controller consistently outperforms three benchmark methods, demonstrating superior trajectory-tracking accuracy and robustness with relatively low control effort. The controller enables aggressive maneuvers such as dynamic throw launches, flip maneuvers, and accelerations exceeding 3g, which is remarkable for a 32-gram nano quadrotor. The experimental code and videos related to this paper are accessible at: https://github.com/A
A. Motivation: Quadrotors require robust control to maintain stability and precise maneuverability under disturbances and uncertainties. One widely studied method in this context is sliding mode control (SMC). One key challenge involves attitude control: as discussed in Section II of the paper, coordinate-free methods exhibit slow convergence and provide only almost-global stability, while quaternion-based methods face the unwinding issue, which can cause unnecessarily prolonged rotations. A second challenge is the need to know the upper bounds of uncertainties; adaptive switching gains eliminate the need for prior knowledge of these bounds. (The authors are with the Autonomous Vehicles Laboratory, Department of Aerospace Engineering, Toronto Metropolitan University, Toronto, Canada.)
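The unwinding issue mentioned above stems from the quaternion double cover of SO(3): q and -q represent the same attitude, so a controller that ignores this can rotate the long way around. A common remedy, sketched below, is to flip the error quaternion when its scalar part is negative; this mirrors the general idea only, not the paper's exact sliding-surface construction (scalar-first convention assumed).

```python
import numpy as np

def shortest_rotation_error(q_err):
    """Unwinding-avoidance sketch: q and -q describe the same attitude,
    so negate the error quaternion when its scalar part is negative,
    making the controller command the shorter of the two rotations."""
    q = np.asarray(q_err, dtype=float)
    return -q if q[0] < 0.0 else q
```

With this sign convention, a 350-degree commanded rotation collapses to the equivalent 10-degree one, avoiding unnecessarily prolonged maneuvers.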


Intelligent Control of Spacecraft Reaction Wheel Attitude Using Deep Reinforcement Learning

El-Dalahmeh, Ghaith, Jabbarpour, Mohammad Reza, Vo, Bao Quoc, Kowalczyk, Ryszard

arXiv.org Artificial Intelligence

Reliable satellite attitude control is essential for the success of space missions, particularly as satellites increasingly operate autonomously in dynamic and uncertain environments. Reaction wheels (RWs) play a pivotal role in attitude control, and maintaining control resilience during RW faults is critical to preserving mission objectives and system stability. However, traditional Proportional-Derivative (PD) controllers and existing deep reinforcement learning (DRL) algorithms such as TD3, PPO, and A2C often fall short in providing the real-time adaptability and fault tolerance required for autonomous satellite operations. This study introduces a DRL-based control strategy designed to improve satellite resilience and adaptability under fault conditions. Specifically, the proposed method integrates Twin Delayed Deep Deterministic Policy Gradient (TD3) with Hindsight Experience Replay (HER) and Dimension-Wise Clipping (DWC), referred to as TD3-HD, to enhance learning in sparse-reward environments and maintain satellite stability during RW failures. The proposed approach is benchmarked against PD control and leading DRL algorithms. Experimental results show that TD3-HD achieves significantly lower attitude error, improved angular velocity regulation, and enhanced stability under fault conditions. These findings underscore the proposed method's potential as a powerful, fault-tolerant, onboard AI solution for autonomous satellite attitude control.
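HER, mentioned above as the key to learning in sparse-reward settings, relabels failed episodes with goals the agent actually reached so they still yield learning signal. The sketch below shows the "final" relabeling strategy in its simplest form; the field names, tolerance, and sparse reward are illustrative assumptions, not this paper's implementation.

```python
import numpy as np

def sparse_reward(achieved, goal, tol=0.05):
    """Sparse goal reward: 0 on success, -1 otherwise (tol illustrative)."""
    return 0.0 if np.linalg.norm(np.asarray(achieved) - np.asarray(goal)) < tol else -1.0

def her_relabel(episode):
    """HER 'final'-strategy sketch: replay the episode with the final
    achieved attitude substituted as the goal, so a failed rollout
    becomes a successful one for learning purposes."""
    new_goal = episode[-1]["achieved"]
    return [{**t, "goal": new_goal, "reward": sparse_reward(t["achieved"], new_goal)}
            for t in episode]
```

Combined with TD3's twin critics and delayed policy updates, this densifies the effective reward signal without changing the task definition.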


Deep Reinforcement Learning Policies for Underactuated Satellite Attitude Control

Hariry, Matteo El, Cini, Andrea, Mellone, Giacomo, Balossino, Alessandro

arXiv.org Artificial Intelligence

Autonomy is a key challenge for future space exploration endeavours. Deep Reinforcement Learning holds the promise of developing agents able to learn complex behaviours simply by interacting with their environment. This paper investigates the use of Reinforcement Learning for the satellite attitude control problem, namely the angular reorientation of a spacecraft with respect to an inertial frame of reference. In the proposed approach, a set of control policies are implemented as neural networks trained with a custom version of the Proximal Policy Optimization algorithm to maneuver a small satellite from a random starting angle to a given pointing target. In particular, we address the problem for two working conditions: the nominal case, in which all the actuators (a set of 3 reaction wheels) are working properly, and the underactuated case, where an actuator failure is simulated randomly along one of the axes. We show that the agents learn to effectively perform large-angle slew maneuvers with fast convergence and industry-standard pointing accuracy. Furthermore, we test the proposed method on representative hardware, showing that by taking adequate measures controllers trained in simulation can perform well in real systems.
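The underactuated case above amounts to masking one wheel's torque channel during simulation so the policy must learn to compensate with the remaining two. A minimal version of such a fault injection is sketched below; the function interface is an illustrative assumption, not the authors' environment code.

```python
import numpy as np

def apply_wheel_failure(torque_cmd, failed_axis):
    """Underactuation sketch: simulate a reaction-wheel fault by zeroing
    the commanded torque on one axis, chosen randomly per episode in a
    training setup like the one described above."""
    t = np.array(torque_cmd, dtype=float)
    t[failed_axis] = 0.0
    return t
```

Randomizing the failed axis across training episodes is what forces the learned policy to generalize over fault locations.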


Tethered Variable Inertial Attitude Control Mechanisms through a Modular Jumping Limbed Robot

Tanaka, Yusuke, Zhu, Alvin, Hong, Dennis

arXiv.org Artificial Intelligence

This paper presents the concept of a tethered variable inertial attitude control mechanism for a modular jumping-limbed robot designed for planetary exploration in low-gravity environments. The system, named SPLITTER, comprises two sub-10 kg quadrupedal robots connected by a tether, capable of executing successive jumping gaits and stabilizing in-flight using inertial morphing technology. Through model predictive control (MPC), attitude control was demonstrated by adjusting the limbs and tether length to modulate the system's principal moments of inertia. Our results indicate that this control strategy allows the robot to stabilize during flight phases without needing traditional flywheel-based systems or relying on aerodynamics, making the approach mass-efficient and ideal for small-scale planetary robots' successive jumps. The paper outlines the dynamics, MPC formulation for inertial morphing, actuator requirements, and simulation results, illustrating the potential of agile exploration for small-scale rovers in low-gravity environments like the Moon or asteroids.
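The physical mechanism behind inertial morphing is conservation of angular momentum: with no external torque, reshaping the limbs or tether changes a principal moment of inertia and therefore rescales the spin rate. The single-axis toy model below illustrates this; the paper's MPC of course works with the full 3-D inertia tensor.

```python
def spin_after_morph(I_before, omega_before, I_after):
    """Inertial-morphing sketch: angular momentum L = I * omega is
    conserved in flight, so changing a principal moment from I_before
    to I_after rescales the body spin rate (single-axis toy model)."""
    return I_before * omega_before / I_after
```

Doubling the moment of inertia by extending limbs halves the spin rate, which is the lever the MPC uses to stabilize attitude between jumps.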


Modeling and In-flight Torso Attitude Stabilization of a Jumping Quadruped

Papadakis, Michail, Olsen, Jørgen Anker, Poulakakis, Ioannis, Alexis, Kostas

arXiv.org Artificial Intelligence

This paper addresses the modeling and attitude control of jumping quadrupeds in low-gravity environments. First, a convex decomposition procedure is presented to generate high-accuracy and low-cost collision geometries for quadrupeds performing agile maneuvers. A hierarchical control architecture is then investigated, separating torso orientation tracking from the generation of suitable, collision-free, corresponding leg motions. Nonlinear Model Predictive Controllers (NMPCs) are utilized in both layers of the controller. To compute the necessary leg motions, a torque allocation strategy is employed that leverages the symmetries of the system to avoid self-collisions and simplify the respective NMPC. To plan periodic trajectories online, a Finite State Machine (FSM)-based weight switching strategy is also used. The proposed controller is first evaluated in simulation, where 90 degree rotations in roll, pitch, and yaw are stabilized in 6.3, 2.4, and 5.5 seconds, respectively. The performance of the controller is further experimentally demonstrated by stabilizing constant and changing orientation references. Overall, this work provides a framework for the development of advanced model-based attitude controllers for jumping legged systems.
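The FSM-based weight switching mentioned above can be pictured as a small lookup from flight phase to NMPC cost weights: during flight the controller prioritizes torso orientation, and in other phases it reweights toward leg objectives. The phase names and values below are illustrative placeholders, not the paper's tuned weights.

```python
def nmpc_weights(phase):
    """FSM weight-switching sketch: the NMPC swaps cost weights by
    flight phase (keys and values are illustrative placeholders)."""
    table = {
        "flight": {"orientation": 10.0, "leg_effort": 1.0},   # track torso attitude
        "stance": {"orientation": 1.0,  "leg_effort": 10.0},  # prepare next jump
    }
    return table[phase]
```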


Model-Free versus Model-Based Reinforcement Learning for Fixed-Wing UAV Attitude Control Under Varying Wind Conditions

Olivares, David, Fournier, Pierre, Vasishta, Pavan, Marzat, Julien

arXiv.org Artificial Intelligence

This paper evaluates and compares the performance of model-free and model-based reinforcement learning for the attitude control of fixed-wing unmanned aerial vehicles using PID as a reference point. The comparison focuses on their ability to handle varying flight dynamics and wind disturbances in a simulated environment. Our results show that the Temporal Difference Model Predictive Control agent outperforms both the PID controller and other model-free reinforcement learning methods in terms of tracking accuracy and robustness over different reference difficulties, particularly in nonlinear flight regimes. Furthermore, we introduce actuation fluctuation as a key metric to assess energy efficiency and actuator wear, and we test two different approaches from the literature: action variation penalty and conditioning for action policy smoothness. We also evaluate all control methods when subject to stochastic turbulence and gusts separately, so as to measure their effects on tracking performance, observe their limitations and outline their implications on the Markov decision process formalism.
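The actuation-fluctuation metric introduced above can be computed, in its simplest form, as the mean absolute change between consecutive actuator commands. The sketch below shows this simple form; the paper's exact definition may differ.

```python
import numpy as np

def actuation_fluctuation(actions):
    """Actuation-fluctuation sketch: mean absolute change between
    consecutive actuator commands over an episode, a proxy for
    energy use and actuator wear."""
    a = np.asarray(actions, dtype=float)
    return float(np.mean(np.abs(np.diff(a, axis=0))))
```

A smooth policy (e.g. one trained with an action-variation penalty) drives this value toward zero even when tracking error is comparable to a jittery baseline.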


A Ducted Fan UAV for Safe Aerial Grabbing and Transfer of Multiple Loads Using Electromagnets

Yin, Zhong, Pei, Hailong

arXiv.org Artificial Intelligence

In recent years, research on aerial grasping, manipulation, and transportation of objects has garnered significant attention. These tasks often require UAVs to operate safely in close proximity to structures or objects and to grasp payloads efficiently. However, widely adopted flying platforms pose safety hazards: unprotected high-speed rotating propellers can harm their surroundings. In addition, the space for carrying payloads on the fuselage is limited, and the restricted position of the payload hinders efficient grasping. To address these issues, this paper presents a coaxial ducted-fan UAV equipped with electromagnets mounted externally on the fuselage, enabling safe grasping and transfer of multiple loads in midair without complex additional actuators. The vehicle can also perform direct human-UAV cargo transfer in the air. The forces acting on the loads during magnetic attachment, and the factors influencing them, are analyzed. An active disturbance rejection control (ADRC) controller is utilized to counteract disturbances during grasping and to achieve attitude control. Finally, flight tests verify the UAV's ability to grasp multiple loads directly from human hands in flight while maintaining attitude tracking.


Morphology and Behavior Co-Optimization of Modular Satellites for Attitude Control

Wang, Yuxing, Li, Jie, Yu, Cong, Li, Xinyang, Huang, Simeng, Chang, Yongzhe, Wang, Xueqian, Liang, Bin

arXiv.org Artificial Intelligence

The emergence of modular satellites marks a significant transformation in spacecraft engineering, introducing a new paradigm of flexibility, resilience, and scalability in space exploration endeavors. In addressing complex challenges such as attitude control, both the satellite's morphological architecture and the controller are crucial for optimizing performance. Despite substantial research on optimal control, there remains a significant gap in developing optimized and practical assembly strategies for modular satellites tailored to specific mission constraints. This research gap primarily arises from the inherently complex nature of co-optimizing design and control, a process known for its notorious bi-level optimization loop. Conventionally tackled through artificial evolution, this issue involves optimizing the morphology based on the fitness of individual controllers, which is sample-inefficient and computationally expensive. In this paper, we introduce a novel gradient-based approach to simultaneously optimize both morphology and control for modular satellites, enhancing their performance and efficiency in attitude control missions. Our Monte Carlo simulations demonstrate that this co-optimization approach results in modular satellites with better mission performance compared to those designed by evolution-based approaches. Furthermore, this study discusses potential avenues for future research.
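The gradient-based co-optimization described above replaces the evolutionary bi-level loop (evolve morphology, train a controller per candidate) with simultaneous descent on a shared differentiable objective. The schematic step below illustrates only that structural idea; the parameterization, gradients, and learning rate are illustrative assumptions, not the paper's method.

```python
import numpy as np

def co_opt_step(morph, policy, grad_morph, grad_policy, lr=0.01):
    """Co-optimization sketch: morphology and controller parameters
    take simultaneous gradient-descent steps on one differentiable
    objective, avoiding a nested design/control optimization loop."""
    return morph - lr * grad_morph, policy - lr * grad_policy
```

Because both parameter sets move every step, no inner controller-training loop is needed per morphology candidate, which is where the claimed sample efficiency comes from.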